2 Notes on Classes with Vapnik-Chervonenkis Dimension 1
Author
Abstract
The Vapnik-Chervonenkis dimension is a combinatorial parameter that reflects the "complexity" of a set of sets (a.k.a. a concept class). It was introduced by Vapnik and Chervonenkis in their seminal paper [1] and has since found many applications, most notably in machine learning theory and in computational geometry. Arguably the most influential consequence of the VC analysis is the fundamental theorem of statistical machine learning, which states that a concept class is learnable (in some precise sense) if and only if its VC-dimension is finite. Furthermore, for such classes a most simple learning rule, empirical risk minimization (ERM), is guaranteed to succeed. The simplest non-trivial structures, in terms of the VC-dimension, are the classes (i.e., sets of subsets) for which that dimension is 1. In this note we show a couple of curious results concerning such classes. The first result shows that such classes share a very simple structure and, as a corollary, that the labeling information contained in any sample labeled by such a class can be compressed into a single instance. The second result shows that, due to some subtle measurability issues and in spite of the above-mentioned fundamental theorem, there are classes of dimension 1 for which an ERM learning rule fails miserably.

1 Preliminaries: The Vapnik-Chervonenkis dimension

Definition 1 ([1]). Let X be any set, and let 2^X denote its power set, the set of all subsets of X. A concept class is a set of subsets of X, H ⊆ 2^X. We will identify ...

(Footnote: I discovered the results presented in this note more than 20 years ago, and have mentioned them in public talks as well as private communications over the years. However, this is the first time I have written them up for publication.)
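As a quick, concrete illustration of Definition 1 (not taken from the note itself), the following minimal Python sketch computes the VC dimension of a finite concept class by brute force; the helper names shatters and vc_dimension are hypothetical, introduced only for this example. Applied to the class of initial segments {x : x ≤ t} over a small finite domain, it returns 1, i.e., exactly the kind of class the note studies.

    from itertools import combinations

    def shatters(concepts, points):
        # A set of points is shattered if the class realizes all 2^|points|
        # possible labelings of it.
        patterns = {tuple(x in c for x in points) for c in concepts}
        return len(patterns) == 2 ** len(points)

    def vc_dimension(domain, concepts):
        # Brute-force VC dimension: the largest k such that some k-element
        # subset of the domain is shattered by the class.
        d = 0
        for k in range(1, len(domain) + 1):
            if any(shatters(concepts, s) for s in combinations(domain, k)):
                d = k
            else:
                # If no k-element set is shattered, no larger set can be either.
                break
        return d

    # Initial segments {x : x <= t} over {1, ..., 5}: a standard example of a
    # concept class with VC dimension 1.
    domain = list(range(1, 6))
    initial_segments = [frozenset(x for x in domain if x <= t) for t in range(0, 6)]
    print(vc_dimension(domain, initial_segments))  # prints 1

The check is exponential in the size of the domain and is meant only to make the definition concrete on toy examples, not as an efficient procedure.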
Similar Articles
Sign rank versus Vapnik-Chervonenkis dimension
This work studies the maximum possible sign rank of sign (N × N)-matrices with a given Vapnik-Chervonenkis dimension d. For d = 1, this maximum is three. For d = 2, this maximum is Θ̃(N^{1/2}). For d > 2, similar but slightly less accurate statements hold. The lower bounds improve on previous ones by Ben-David et al., and the upper bounds are novel. The lower bounds are obtained by probabilistic constr...
Error Bounds for Real Function Classes Based on Discretized Vapnik-Chervonenkis Dimensions
The Vapnik-Chervonenkis (VC) dimension plays an important role in statistical learning theory. In this paper, we propose the discretized VC dimension obtained by discretizing the range of a real function class. Then, we point out that Sauer's Lemma is valid for the discretized VC dimension. We group the real function classes with infinite VC dimension into four categories by using the dis...
Vapnik-Chervonenkis Dimension
Valiant’s theorem from the previous lecture is meaningless for infinite hypothesis classes, or even classes with more than exponential size. In 1968, Vladimir Vapnik and Alexey Chervonenkis wrote a very original and influential paper (in Russian) [5, 6] which allows us to estimate the sample complexity for infinite hypothesis classes too. The idea is that the size of the hypothesis class is a p...
On the VC-dimension and Boolean functions with long runs
The Vapnik-Chervonenkis (VC) dimension and the Sauer-Shelah lemma have found applications in numerous areas including set theory, combinatorial geometry, graph theory and statistical learning theory. Estimation of the complexity of discrete structures associated with the search space of algorithms often amounts to estimating the cardinality of a simpler class which is effectively induced by som...
Complexity of VC-classes of sequences with long repetitive runs
The Vapnik-Chervonenkis (VC) dimension (also known as the trace number) and the Sauer-Shelah lemma have found applications in numerous areas including set theory, combinatorial geometry, graph theory and statistical learning theory. Estimation of the complexity of discrete structures associated with the search space of algorithms often amounts to estimating the cardinality of a simpler class wh...
Journal: CoRR
Volume: abs/1507.05307
Pages: -
Publication year: 2015